Abstract: It is proved that the information divergence statistic is infinitely more Bahadur efficient than the power divergence statistics of the orders $\alpha > 1$ as long as the sequence of alternatives is contiguous with respect to the sequence of null hypotheses and the number of observations per bin increases to infinity not too slowly. This improves the former result in Harremoës and Vajda (2008), where the sequence of null hypotheses was assumed to be uniform and the restrictions on the numbers of observations per bin were sharper. Moreover, this paper also evaluates the Bahadur efficiency of the power divergence statistics of the remaining positive orders $0 < \alpha \le 1$. The statistics of these orders are mutually Bahadur-comparable and all of them are more Bahadur efficient than the statistics of the orders $\alpha > 1$. A detailed discussion of the technical definitions and conditions is given, some unclear points are resolved, and the results are illustrated by examples.

Index Terms: Bahadur efficiency, consistency, power divergence, Rényi divergence.

I. Introduction 

PROBLEMS of detection, classification and identification are often solved by the method of testing statistical hypotheses. Consider signals $Y_1, Y_2, \ldots, Y_n$ collected from a random source independently at time instants $i = 1, 2, \ldots, n$. Signal processing usually requires digitalization based on appropriate quantization. Quantization of the signal space $\mathcal{Y}$ into $k$ disjoint cells (or bins) $\mathcal{Y}_{n1}, \mathcal{Y}_{n2}, \ldots, \mathcal{Y}_{nk}$ reduces the signals $Y_1, Y_2, \ldots, Y_n$ into simple $k$-valued indicators $I_n(Y_1), I_n(Y_2), \ldots, I_n(Y_n)$ of their cover cells. Various hypotheses about the data source represented by probability measures $Q$ on $\mathcal{Y}$ are transformed by the quantization into discrete probability distributions

$$Q_n = (q_{n1} = Q(\mathcal{Y}_{n1}), \ldots, q_{nk} = Q(\mathcal{Y}_{nk}))$$

on the quantization cells, where $q_{nj} = 0$ for no quantization cell. These hypothetical distributions need not be the same as the true distributions $P_n = (p_{n1} = P(\mathcal{Y}_{n1}), \ldots, p_{nk} = P(\mathcal{Y}_{nk}))$. The latter distributions are usually unknown but, by the law of large numbers, they can be approximated by the empirical distributions (vectors of relative cell frequencies)

$$\hat P_n = (\hat p_{n1}, \ldots, \hat p_{nk}) = \left( \frac{X_{n1}}{n}, \ldots, \frac{X_{nk}}{n} \right) = \frac{X_n}{n} \qquad (1)$$

where $X_{nj}$ is the number of the signals $Y_1, Y_2, \ldots, Y_n$ in $\mathcal{Y}_{nj}$. Formally,

$$X_{nj} = \sum_{i=1}^n 1_{\{Y_i \in \mathcal{Y}_{nj}\}} = \sum_{i=1}^n 1_{\{I_n(Y_i) = j\}}, \quad 1 \le j \le k \qquad (2)$$
where $1_A$ denotes the indicator of the event $A$. The problem is to decide whether the signals $Y_1, Y_2, \ldots, Y_n$ are generated by the source $(\mathcal{Y}, Q)$ on the basis of the distributions $\hat P_n$, $Q_n$. A classical method for solving this problem is the method of testing statistical hypotheses in the spirit of Fisher, Neyman and Pearson. In our case the hypothesis is

$$\mathcal{H} : P_n = Q_n \qquad (3)$$

and the decision is based either on the likelihood ratio statistic

$$\hat T_{1,n} = 2 \sum_{j=1}^k X_{nj} \ln \frac{X_{nj}}{n q_{nj}} \qquad (4)$$

or the Pearson $\chi^2$-statistic

$$\hat T_{2,n} = \sum_{j=1}^k \frac{(X_{nj} - n q_{nj})^2}{n q_{nj}} \qquad (5)$$



in the sense that the hypothesis is rejected when the statistic is large, where "large" depends on the required decision error or risk [1].
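As a hedged illustration of the quantities introduced so far, the following minimal Python sketch computes the frequency counts of (2) from a list of cell indices and evaluates the likelihood ratio statistic (4) and the Pearson statistic (5). The function names and the toy data are assumptions made for this illustration only; they do not come from the paper.

```python
import math
from collections import Counter


def cell_counts(cells, k):
    """Frequency counts (X_n1, ..., X_nk) of the cell indices 1..k, cf. (2)."""
    tally = Counter(cells)
    return [tally.get(j, 0) for j in range(1, k + 1)]


def likelihood_ratio_statistic(counts, q):
    """Statistic (4); empty cells contribute 0 (convention 0 ln 0 = 0)."""
    n = sum(counts)
    return 2.0 * sum(x * math.log(x / (n * qj))
                     for x, qj in zip(counts, q) if x > 0)


def pearson_statistic(counts, q):
    """Statistic (5)."""
    n = sum(counts)
    return sum((x - n * qj) ** 2 / (n * qj) for x, qj in zip(counts, q))


# Hypothetical data: n = 8 quantized signals, k = 4 cells, uniform hypothesis.
cells = [1, 1, 2, 3, 3, 3, 4, 1]
q = [0.25, 0.25, 0.25, 0.25]
counts = cell_counts(cells, 4)
print(counts)                                  # [3, 1, 3, 1]
print(likelihood_ratio_statistic(counts, q))
print(pearson_statistic(counts, q))            # 2.0
```

Both statistics vanish when the counts match $n q_{nj}$ exactly and grow as the empirical distribution moves away from $Q_n$, which is the sense in which large values lead to rejection.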

It is easy to see (cf. (13), (14) below) that the classical test statistics (4), (5) are of the form

$$\hat T_{\alpha,n} = 2n \hat D_{\alpha,n} = 2n D_\alpha(\hat P_n, Q_n), \quad \alpha \in \{1, 2\} \qquad (6)$$



where $D_\alpha(P, Q)$ for arbitrary $\alpha > 0$ and distributions $P = (p_1, \ldots, p_k)$, $Q = (q_1, \ldots, q_k)$ denotes the divergence $D_{\phi_\alpha}(P, Q)$ of Csiszár [2] for the power function

$$\phi_\alpha(t) = \frac{t^\alpha - \alpha(t - 1) - 1}{\alpha(\alpha - 1)} \quad \text{when } \alpha \neq 1 \qquad (7)$$

and

$$\phi_1(t) = \lim_{\alpha \to 1} \phi_\alpha(t) = t \ln t - t + 1. \qquad (8)$$

The power divergences

$$D_\alpha(P, Q) = \frac{1}{\alpha(\alpha - 1)} \left( \sum_{j=1}^k p_j^\alpha q_j^{1-\alpha} - 1 \right), \quad \alpha \neq 1 \qquad (9)$$

or the one-one related Rényi divergences [3]

$$D_\alpha(P\|Q) = \frac{1}{\alpha - 1} \ln \sum_{j=1}^k p_j^\alpha q_j^{1-\alpha}, \quad \alpha \neq 1$$

with the common information divergence limit

$$D_1(P, Q) = D_1(P\|Q) = \sum_{j=1}^k p_j \ln \frac{p_j}{q_j} \qquad (10)$$
are often applied in various areas of information theory. In the present context of detection and identification one can mention e.g. the work of Kailath [4], who used the Bhattacharyya distance

$$B(P, Q) = -\ln \sum_{j=1}^k (p_j q_j)^{1/2} = \frac{1}{2} D_{1/2}(P\|Q)$$

which is one-one related to the Hellinger divergence.
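The divergences just introduced can be evaluated numerically. The sketch below is a minimal illustration, assuming distributions given as Python lists with positive hypothetical probabilities; all function names and the example vectors are assumptions of this sketch. It evaluates the power divergence, the Rényi divergence and the Bhattacharyya distance, and shows that orders close to 1 approach the information divergence.

```python
import math


def power_divergence(p, q, alpha):
    """Power divergence D_alpha(P, Q), alpha > 0; alpha = 1 is the information divergence."""
    if alpha == 1:
        return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)
    s = sum(pj ** alpha * qj ** (1 - alpha) for pj, qj in zip(p, q))
    return (s - 1) / (alpha * (alpha - 1))


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(P||Q), alpha > 0; alpha = 1 is taken as the limit."""
    if alpha == 1:
        return power_divergence(p, q, 1)
    s = sum(pj ** alpha * qj ** (1 - alpha) for pj, qj in zip(p, q))
    return math.log(s) / (alpha - 1)


def bhattacharyya_distance(p, q):
    """B(P, Q) = -ln sum_j sqrt(p_j q_j)."""
    return -math.log(sum(math.sqrt(pj * qj) for pj, qj in zip(p, q)))


p = [0.5, 0.3, 0.2]            # hypothetical alternative distribution
q = [1 / 3, 1 / 3, 1 / 3]      # hypothetical (uniform) null distribution
for alpha in (0.5, 0.999, 1, 1.001, 2):
    print(alpha, power_divergence(p, q, alpha), renyi_divergence(p, q, alpha))
print(bhattacharyya_distance(p, q), 0.5 * renyi_divergence(p, q, 0.5))
```

The last line prints the two sides of the identity $B(P, Q) = \frac{1}{2} D_{1/2}(P\|Q)$, which agree up to rounding.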

In practical applications it is important to use the statistic $\hat D_{\alpha_{opt},n}$ which is optimal in a sufficiently wide class of divergence statistics $\hat D_{\alpha,n}$ containing the standard statistical proposals $\hat D_{1,n}$ and $\hat D_{2,n}$ appearing in (6). We addressed this problem previously [5]-[7]. Our solution confirmed the classical statistical result of Quine and Robinson [8], who proved that the likelihood ratio statistic $\hat D_{1,n}$ is more efficient in the Bahadur sense than the $\chi^2$-statistic $\hat D_{2,n}$, and extended the results of Beirlant et al. [9] and Györfi et al. [10] dealing with Bahadur efficiency of several selected power divergence statistics. Namely, we evaluated the Bahadur efficiencies of the statistics $\hat D_{\alpha,n}$ in the domain $\alpha \ge 1$ for the numbers $k = k_n$ of quantization cells slowly increasing with $n$ when the hypothetical distributions $Q_n$ are uniform and the alternative distributions $P_n$ are contiguous in the sense that $\lim_{n \to \infty} D_\alpha(P_n, Q_n)$ exists and identifiable in the sense that this limit is positive. We found that the Bahadur efficiencies decrease with the power parameter in the whole domain $\alpha \ge 1$. In the present paper we sharpen this result by relaxing the conditions on the rate of $k_n$ and extend it considerably by admitting non-uniform hypothetical distributions $Q_n$ and by evaluating the Bahadur efficiencies also in the domain $0 < \alpha < 1$.



II. Basic model 

Let $M(k)$ denote the set of all probability distributions $P = (p_j : 1 \le j \le k)$ and

$$M(k|n) = \{P \in M(k) : nP \in \{0, 1, \ldots\}^k\}$$

its subset called the set of types in information theory. We consider hypothetical distributions $Q_n = (q_{nj} : 1 \le j \le k) \in M(k)$ restricted by the condition $q_{nj} > 0$ and arbitrary alternative distributions $P_n = (p_{nj} : 1 \le j \le k) \in M(k)$. The $\{0, 1, \ldots\}^k$-valued frequency counts $X_n$ with coordinates introduced in (2) are multinomially distributed in the sense

$$X_n \sim \mathrm{Mult}_k(n, P_n), \quad n = 1, 2, \ldots. \qquad (12)$$

Important components of the model are the empirical distributions $\hat P_n \in M(k|n)$ defined by (1). Finally, for arbitrary $P \in M(k)$ and arbitrary $Q \in M(k)$ with positive coordinates we consider the power divergences (9), (10). For their properties we refer to [11]-[13]. In particular, for the empirical and hypothetical distributions $\hat P_n$, $Q_n$ we consider the power divergence statistics $\hat D_{\alpha,n} = D_\alpha(\hat P_n, Q_n)$ (cf. (6)) defined by (9), (10) for all $\alpha > 0$.



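Example 1: For $\alpha = 2$, $\alpha = 1$ and $\alpha = 1/2$ we get the special power divergence statistics

$$\hat D_{2,n} = \frac{1}{2} \sum_{j=1}^k \frac{(\hat p_{nj} - q_{nj})^2}{q_{nj}} = \frac{1}{2n} \hat T_{2,n}, \qquad (13)$$

$$\hat D_{1,n} = \sum_{j=1}^k \hat p_{nj} \ln \frac{\hat p_{nj}}{q_{nj}} = \frac{1}{2n} \hat T_{1,n}, \qquad (14)$$

$$\hat D_{1/2,n} = 2 \sum_{j=1}^k \left( \hat p_{nj}^{1/2} - q_{nj}^{1/2} \right)^2. \qquad (15)$$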

For testing the hypothesis $\mathcal{H}$ of (3), one usually uses the rescaled versions

$$\hat T_{\alpha,n} = 2n \hat D_{\alpha,n} \qquad (16)$$

which are distributed under $\mathcal{H}$ asymptotically $\chi^2$ with $k - 1$ degrees of freedom if $k$ is constant and asymptotically normally if $k = k_n$ slowly increases to infinity (see [14], [15] and references therein). The statistics (13) and (14) rescaled in this manner were already mentioned in (5) and (4). The statistic in (15) is the Hellinger divergence statistic, which rescaled by $2n$ is known as the Freeman-Tukey statistic

$$\hat T_{1/2,n} = 2n \hat D_{1/2,n} = 4 \sum_{j=1}^k \left( X_{nj}^{1/2} - (n q_{nj})^{1/2} \right)^2. \qquad (17)$$
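The identities of Example 1 and the rescaling (16) can be checked numerically. The following minimal sketch, with a hypothetical function name and toy counts chosen only for illustration, computes $2n D_\alpha(\hat P_n, Q_n)$ directly from the definition of the power divergence and compares it with the likelihood ratio, Pearson and Freeman-Tukey forms.

```python
import math


def rescaled_statistic(counts, q, alpha):
    """2 n D_alpha(P_hat_n, Q_n), cf. (16), computed from the power divergence."""
    n = sum(counts)
    p_hat = [x / n for x in counts]
    if alpha == 1:
        d = sum(p * math.log(p / qj) for p, qj in zip(p_hat, q) if p > 0)
    else:
        s = sum(p ** alpha * qj ** (1 - alpha) for p, qj in zip(p_hat, q))
        d = (s - 1) / (alpha * (alpha - 1))
    return 2 * n * d


# Hypothetical counts for k = 4 cells and a uniform hypothesis.
counts = [3, 1, 3, 1]
q = [0.25, 0.25, 0.25, 0.25]
n = sum(counts)

likelihood_ratio = 2 * sum(x * math.log(x / (n * qj))
                           for x, qj in zip(counts, q) if x > 0)
pearson = sum((x - n * qj) ** 2 / (n * qj) for x, qj in zip(counts, q))
freeman_tukey = 4 * sum((math.sqrt(x) - math.sqrt(n * qj)) ** 2
                        for x, qj in zip(counts, q))

print(rescaled_statistic(counts, q, 1), likelihood_ratio)
print(rescaled_statistic(counts, q, 2), pearson)
print(rescaled_statistic(counts, q, 0.5), freeman_tukey)
```

Each printed pair agrees up to floating point rounding, which is the content of (13), (14) and (17) for this toy sample.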

a) Convention: Unless the hypothesis $\mathcal{H}$ is explicitly assumed, the random variables, convergences and asymptotic relations are considered under the alternative $\mathcal{A}$. Further, unless otherwise explicitly stated, the asymptotic relations are considered for $n \to \infty$ and the symbols of the type

$$s_n \to s \quad \text{and} \quad s_n(X_n) \xrightarrow{p} s$$

denote the ordinary numerical convergence and the stochastic convergence in probability for $n \to \infty$.

In this paper we consider the following assumptions. 

A1: The number of cells $k = k_n \le n$ of the distributions from $M(k)$, $M(k|n)$ depends on the sample size $n$ and increases to infinity. In the rest of the paper the subscript $n$ is suppressed in the symbols containing $k$.

A2: The hypothetical distributions $Q_n = (q_{nj} > 0 : 1 \le j \le k)$ are regular in the sense that $\max_j q_{nj} \to 0$ for $n \to \infty$ and that there exists $\varrho > 0$ such that

$$q_{nj} > \frac{\varrho}{k} \quad \text{for all } 1 \le j \le k \text{ and } n = 1, 2, \ldots. \qquad (18)$$

A3α: The alternative $\mathcal{A} : (P_n : n = 1, 2, \ldots)$ is identifiable in the sense that there exists $0 < \Delta_\alpha < \infty$ such that

$$D_{\alpha,n} \stackrel{def}{=} D_\alpha(P_n, Q_n) \to \Delta_\alpha \quad \text{under } \mathcal{A}. \qquad (19)$$

Under A2

$$-\ln q_{nj} < \ln \frac{k}{\varrho} \quad \text{and} \quad \ln^2 q_{nj} < \ln^2 \frac{k}{\varrho}. \qquad (20)$$
Further, the logical complement to the hypothesis $\mathcal{H}$ is the alternative denoted by $\mathcal{A}$. By (3), under $\mathcal{A}$ the alternative distributions $P_n$ differ from $Q_n$. Assumption A3α means that the alternative distributions are neither too close to nor too distant from $Q_n$ in the sense of $D_\alpha$-divergence for given $\alpha > 0$. Since for all $n = 1, 2, \ldots$

$$D_{\alpha,n} = D_\alpha(Q_n, Q_n) \equiv 0 \quad \text{so that } \Delta_\alpha = 0 \text{ under } \mathcal{H},$$

it is clear that the alternative $\mathcal{A}$ is under A1, A2, A3α distinguished from the hypothesis $\mathcal{H}$ by achieving a positive $D_\alpha$-divergence limit $\Delta_\alpha$. In what follows we use the abbreviated notations

$$A(\alpha) = \{A1, A2, A3\alpha\}, \qquad (21)$$

$$A(\alpha_1, \alpha_2) = \{A1, A2, A3\alpha_1, A3\alpha_2\} \qquad (22)$$

for the combinations of assumptions.

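Definition 2: Under A(α) we say that the statistic $\hat D_{\alpha,n}$ is consistent with parameter $\Delta_\alpha$ appearing in (19) if

$$\hat D_{\alpha,n} \xrightarrow{p} \Delta_\alpha \quad \text{under } \mathcal{A} \qquad (23)$$

and

$$\hat D_{\alpha,n} \xrightarrow{p} 0 \quad \text{under } \mathcal{H}, \qquad (24)$$

i.e. if $\hat D_{\alpha,n} \xrightarrow{p} \Delta_\alpha$ under both $\mathcal{A}$ and $\mathcal{H}$. If (24) is replaced by the stronger condition that the expectation $E \hat D_{\alpha,n}$ tends to zero under $\mathcal{H}$, in symbols

$$E \hat D_{\alpha,n} \to 0, \qquad (25)$$

then $\hat D_{\alpha,n}$ is said to be strongly consistent.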



Definition 3: We say that the statistic $\hat D_{\alpha,n}$ is Bahadur stable if there is a continuous function $\varrho_\alpha : \,]0, \infty[^2 \to \,]0, \infty[$ such that the probability of error function

$$e_{\alpha,n}(\Delta) = P(\hat D_{\alpha,n} > \Delta \mid \mathcal{H}), \quad \Delta > 0 \qquad (26)$$

corresponding to the test rejecting $\mathcal{H}$ when $\hat D_{\alpha,n} > \Delta$ satisfies for all $\Delta_1, \Delta_2 > 0$ the relation

$$\frac{\ln e_{\alpha,n}(\Delta_1)}{\ln e_{\alpha,n}(\Delta_2)} \to \varrho_\alpha(\Delta_1, \Delta_2).$$

If this condition holds then $\varrho_\alpha$ is called the Bahadur relative function.

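Obviously, the Bahadur relative functions are multiplicative in the sense

$$\varrho_\alpha(\Delta_1, \Delta_2)\, \varrho_\alpha(\Delta_2, \Delta_3) = \varrho_\alpha(\Delta_1, \Delta_3).$$

Statistics that are Bahadur stable have the nice property that the asymptotic behavior of the error function $e_{\alpha,n}(\Delta)$ is determined by its behavior for just a single argument $\Delta^* > 0$. Indeed, if $\hat D_{\alpha,n}$ is Bahadur stable and if we define for a fixed $\Delta^* > 0$ the sequence

$$c^*_\alpha(n) = -\frac{n}{\ln e_{\alpha,n}(\Delta^*)}$$

then

$$-\frac{c^*_\alpha(n)}{n} \ln e_{\alpha,n}(\Delta) = \frac{\ln e_{\alpha,n}(\Delta)}{\ln e_{\alpha,n}(\Delta^*)} \to \varrho_\alpha(\Delta, \Delta^*) \quad \text{for all } \Delta > 0.$$

Moreover, if the expressions $-\frac{c_\alpha(n)}{n} \ln e_{\alpha,n}(\Delta)$ converge for a sequence $c_\alpha(n)$ then the ratio $c_\alpha(n)/c^*_\alpha(n)$ tends to a constant.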

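b) Motivation of the next definition: Suppose that condition A($\alpha_1, \alpha_2$) holds and denote for each $\alpha \in \{\alpha_1, \alpha_2\}$ and $n = 1, 2, \ldots$ by $\Delta_\alpha + \varepsilon_{\alpha,n}$ the critical value of the statistic $\hat D_{\alpha,n}$ leading to the rejection of $\mathcal{H}$ with a fixed power $0 < p < 1$. In other words, let

$$p = P(\hat D_{\alpha,n} > \Delta_\alpha + \varepsilon_{\alpha,n} \mid \mathcal{A}) \quad \text{for all } n = 1, 2, \ldots$$

where the sequence $\varepsilon_{\alpha,n} = \varepsilon_{\alpha,n}(p)$ depends on the fixed $p$. Since the assumed consistency of $\hat D_{\alpha,n}$ implies that $\varepsilon_{\alpha,n}$ tends to zero, the corresponding error probabilities $e_{\alpha,n}(\Delta_\alpha + \varepsilon_{\alpha,n}) = P(\hat D_{\alpha,n} > \Delta_\alpha + \varepsilon_{\alpha,n} \mid \mathcal{H})$ can be approximated by $e_{\alpha,n}(\Delta_\alpha) = P(\hat D_{\alpha,n} > \Delta_\alpha \mid \mathcal{H})$. By (33),

$$-\frac{c_\alpha(n)}{n} \ln e_{\alpha,n}(\Delta_\alpha) \to g_\alpha(\Delta_\alpha).$$

Hence the error $e_{\alpha_1,n}(\Delta_{\alpha_1})$ of the statistic $\hat D_{\alpha_1,n}$ tends to zero with the same exponential rate as $e_{\alpha_2,m_n}(\Delta_{\alpha_2})$ achieved by $\hat D_{\alpha_2,m_n}$ for possibly different sample sizes $m_n \neq n$ with the property $m_n \to \infty$ if the corresponding error exponents

$$\frac{n}{c_{\alpha_1}(n)}\, g_{\alpha_1}(\Delta_{\alpha_1}) \quad \text{and} \quad \frac{m_n}{c_{\alpha_2}(m_n)}\, g_{\alpha_2}(\Delta_{\alpha_2}) \qquad (28)$$

tend to infinity with the same rate in the sense

$$\frac{m_n}{c_{\alpha_2}(m_n)} = \frac{g_{\alpha_1}(\Delta_{\alpha_1})}{g_{\alpha_2}(\Delta_{\alpha_2})} \cdot \frac{n}{c_{\alpha_1}(n)}\, (1 + o(1)). \qquad (29)$$

The sample sizes $m_n$ and $n$ needed by the statistics $\hat D_{\alpha_2,n}$ and $\hat D_{\alpha_1,n}$ to achieve the same rate of convergence of errors are thus mutually related by the formula

$$\frac{m_n}{n} = \frac{g_{\alpha_1}(\Delta_{\alpha_1})}{g_{\alpha_2}(\Delta_{\alpha_2})} \cdot \frac{c_{\alpha_2}(m_n)}{c_{\alpha_1}(n)}. \qquad (30)$$

Obviously, the statistic $\hat D_{\alpha_1,n}$ is asymptotically more or less efficient than $\hat D_{\alpha_2,n}$ if the ratio $m_n/n$ of sample sizes needed to achieve the same rate of convergence of errors to zero tends to a constant larger or smaller than 1, because $m_n$ is the sample size that $\hat D_{\alpha_2,n}$ needs in order to match the error of $\hat D_{\alpha_1,n}$ based on $n$ observations. This motivates the following definition which refers to the typical convergent situation

$$\frac{c_{\alpha_2}(m_n)}{c_{\alpha_1}(n)} \to c_{\alpha_2/\alpha_1} \quad \text{for some } 0 \le c_{\alpha_2/\alpha_1} \le \infty. \qquad (31)$$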



Definition 4: If there is a continuous function

$$g_\alpha : \,]0, \infty[\, \to \,]0, \infty[$$

and a sequence $c_\alpha(n)$ such that the error function

$$e_{\alpha,n}(x) = P(\hat D_{\alpha,n} > x \mid \mathcal{H}), \quad x > 0 \qquad (32)$$

satisfies for all $x > 0$ the relation

$$-\frac{c_\alpha(n)}{n} \ln e_{\alpha,n}(x) \to g_\alpha(x) \qquad (33)$$

then $g_\alpha$ is called the Bahadur function of the statistic $\hat D_{\alpha,n}$ generated by $c_\alpha(n)$. If (33) is replaced by the condition

$$-\frac{c_\alpha(n)}{n} \ln e_{\alpha,n}(x + \varepsilon_n) \to g_\alpha(x) \quad \text{for arbitrary } \varepsilon_n \to 0 \qquad (34)$$

then the function $g_\alpha$ is strongly Bahadur.

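Definition 5: Let us assume that A($\alpha_1, \alpha_2$) holds and that for each $\alpha \in \{\alpha_1, \alpha_2\}$ the statistic $\hat D_{\alpha,n}$ is consistent with parameter $\Delta_\alpha$ and has a Bahadur function $g_\alpha$ generated by a sequence $c_\alpha(n)$ such that (31) is satisfied. Then the Bahadur efficiency of $\hat D_{\alpha_1,n}$ with respect to $\hat D_{\alpha_2,n}$ is the number from the interval $[0, \infty]$ defined by the formula

$$BE(\hat D_{\alpha_1,n}; \hat D_{\alpha_2,n}) = \frac{g_{\alpha_1}(\Delta_{\alpha_1})}{g_{\alpha_2}(\Delta_{\alpha_2})} \cdot c_{\alpha_2/\alpha_1}. \qquad (35)$$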



Hereafter we shall consider also a slightly modified concept of Bahadur efficiency.

Definition 6: Let in addition to the assumptions of Definition 5 the statistics $\hat D_{\alpha_1,n}$, $\hat D_{\alpha_2,n}$ be strongly consistent and the functions $g_{\alpha_1}$, $g_{\alpha_2}$ strongly Bahadur. Then the Bahadur efficiency (35) is said to be the Bahadur efficiency in the strong sense.

c) Motivation of Definition 6: Let the assumptions of this definition hold. Then for each $\alpha \in \{\alpha_1, \alpha_2\}$ and $u > 0$ the function

$$L_{\alpha,n}(u) = P(\hat T_{\alpha,n} > u \mid \mathcal{H}) \quad (\text{cf. } (32))$$

denotes the level of the error of the statistic $\hat T_{\alpha,n} = 2n \hat D_{\alpha,n}$ for critical value $u > 0$. By the assumed strong consistency of $\hat D_{\alpha,n}$,

$$\frac{E[\hat T_{\alpha,n} \mid \mathcal{H}]}{2n} \to 0 \quad (\text{cf. } (25)).$$

This means that the sequence $c_\alpha(n)$ generating the strongly Bahadur $g_\alpha$ satisfies for all $t > 0$ the relation

$$-\frac{c_\alpha(n)}{n} \ln P\left( \hat T_{\alpha,n} > E[\hat T_{\alpha,n} \mid \mathcal{H}] + 2nt \mid \mathcal{H} \right) \to g_\alpha(t) \quad (\text{cf. } (34)).$$

Consequently, by the argument of Quine and Robinson [8, p. 732],

$$-\frac{c_\alpha(n)}{n} \ln L_{\alpha,n}(\hat T_{\alpha,n}) \to g_\alpha(\Delta_\alpha).$$

Hence [8], the error level $L_{\alpha_1,n}(\hat T_{\alpha_1,n})$ of the statistic $\hat T_{\alpha_1,n} = 2n \hat D_{\alpha_1,n}$ is asymptotically equivalent to the error level $L_{\alpha_2,m_n}(\hat T_{\alpha_2,m_n})$ of the statistic $\hat T_{\alpha_2,m_n} = 2 m_n \hat D_{\alpha_2,m_n}$ achieved by a sample size $m_n$ if the comparability (29) takes place or, in other words, if the sample sizes $n$ and $m_n$ are mutually related by (30). Thus the concept of Bahadur efficiency introduced in this paper coincides under the stronger assumptions of Definition 6 with the Bahadur efficiency of Quine and Robinson [8].



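Harremoës and Vajda [5] assumed the same strong consistency as in Definition 6 but introduced the Bahadur efficiency by the slightly different formula

$$BE(\hat D_{\alpha_1,n}; \hat D_{\alpha_2,n}) = \frac{g_{\alpha_1}(\Delta_{\alpha_1})}{g_{\alpha_2}(\Delta_{\alpha_2})} \cdot \bar c_{\alpha_2/\alpha_1} \qquad (36)$$

where$^1$

$$\bar c_{\alpha_2/\alpha_1} = \lim_{n \to \infty} \frac{c_{\alpha_2}(n)}{c_{\alpha_1}(n)}. \qquad (37)$$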



III. Consistency 

In this section we study the consistency of the class of power 
divergence statistics D a (P n , Q n ), a > 0. In the domain a < 
this consistency was studied in the particular case of uniform 
$Q$ by Harremoës and Vajda [6].

Theorem 7: Let distributions $Q_n \in M(k)$ satisfy the assumption A(α). Assume that $f$ is uniformly continuous. Then the statistic $D_f(\hat P_n, Q_n)$ is strongly consistent provided

$$\frac{n}{k} \to \infty. \qquad (38)$$



Proof: Under $\mathcal{H}$ we have $D_f(P_n, Q_n) = D_f(Q_n, Q_n) \equiv 0$. Hence it suffices to prove

$$E|\Delta_{\alpha,n}| \to 0 \quad \text{under both } \mathcal{H} \text{ and } \mathcal{A} \qquad (39)$$

for $\Delta_{\alpha,n} = D_f(\hat P_n, Q_n) - D_f(P_n, Q_n)$. For simplicity we skip the subscript $n$ in the symbols $\hat P_n$, $P_n$, and $Q_n$, i.e. we substitute

$$\hat P_n = \hat P = (\hat p_j : 1 \le j \le k), \quad P_n = P = (p_j : 1 \le j \le k). \qquad (40)$$

This leads to the simplified formula $\Delta_{\alpha,n} = D_f(\hat P, Q) - D_f(P, Q)$. We can without loss of generality assume that $D_f(P, Q)$ is constant not only under $\mathcal{H}$ (where the constant is automatically 0) but also under $\mathcal{A}$ (where the assumed detectability implies the convergence $D_f(P, Q) \to \Delta_\alpha$ for $0 < \Delta_\alpha < \infty$). In this asymptotic sense we use the equalities



and 



Df(P,Q) 



An 



a (a — 1) 



D f (P,U)-A c 



(41) 
(42) 



Choose some $0 < s < 1$ and define

$$f_s(t) = \begin{cases} f(t) & \text{for } t \ge s, \\ f(s) + f'_+(s)(t - s) & \text{for } 0 \le t < s. \end{cases}$$

Then

$$0 \le f(t) - f_s(t) \le f(0) - f_s(0)$$

so that © implies 

$$0 \le D_f(P, Q) - D_{f_s}(P, Q) \le f(0) - f_s(0).$$

$^1$ Due to a misprint, $\alpha_1$ and $\alpha_2$ were interchanged behind the limit in [5, Eq. 30], but the formula was used in the correct form (36). In the Appendix we prove that the conclusions made on the basis of the original formula (36) hold unchanged under the present, more precise formula (35).
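The function $f_s$ is Lipschitz with the Lipschitz constant $\lambda = \max\{|f'_+(s)|, |f'(\infty)|\}$, i.e. $|f_s(t_1) - f_s(t_2)| \le \lambda |t_1 - t_2|$ for all $t_1, t_2 \ge 0$. Then

$$\left| D_{f_s}(\hat P_n, Q) - D_{f_s}(P_n, Q) \right| \le \sum_{j=1}^k q_j \left| f_s\!\left( \frac{\hat p_j}{q_j} \right) - f_s\!\left( \frac{p_j}{q_j} \right) \right| \le \lambda \sum_{j=1}^k |\hat p_j - p_j| \le \lambda \left( k \sum_{j=1}^k (\hat p_j - p_j)^2 \right)^{1/2}$$

where in the last step we used the Schwarz inequality. Since

$$E (\hat p_j - p_j)^2 = p_j(1 - p_j)/n \le p_j/n \qquad (43)$$

it holds

$$E \left| D_{f_s}(\hat P_n, Q) - D_{f_s}(P_n, Q) \right| \le \lambda\, E \left( k \sum_{j=1}^k (\hat p_j - p_j)^2 \right)^{1/2} \le \lambda \left( k \sum_{j=1}^k \frac{p_j}{n} \right)^{1/2} = \lambda \left( \frac{k}{n} \right)^{1/2}.$$

Consequently,

$$E \left| D_f(\hat P_n, Q) - D_f(P_n, Q) \right| \le 2 (f(0) - f_s(0)) + \lambda (k/n)^{1/2}$$

so that under (38)

$$\limsup_{n \to \infty} E \left| D_f(\hat P_n, Q_n) - D_f(P_n, Q_n) \right| \le 2 (f(0) - f_s(0)).$$

This holds for all $s > 0$. Since $f(0) - f_s(0) \to 0$ for $s \downarrow 0$, we see that in this case (38) implies (39). ■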
The interpretation of condition (38) is that the mean number of observations per bin should tend to infinity under $\mathcal{H}$. Note that this condition does not exclude that we will observe empty cells.
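The role of the ratio $n/k$ can be illustrated by a small Monte Carlo experiment. The sketch below is only an illustration under assumptions of our own choosing: a uniform hypothesis, the order $\alpha = 1/2$, $k = 50$ cells, 20 repetitions and a fixed seed. It estimates the mean of $D_{1/2}(\hat P_n, Q_n)$ under the hypothesis, together with the average fraction of empty cells, for increasing $n$. It is not part of the proof.

```python
import math
import random


def hellinger_statistic(counts, q):
    """D_{1/2}(P_hat, Q) = 2 * sum_j (sqrt(p_hat_j) - sqrt(q_j))^2."""
    n = sum(counts)
    return 2 * sum((math.sqrt(x / n) - math.sqrt(qj)) ** 2
                   for x, qj in zip(counts, q))


def simulate_under_hypothesis(n, k, repetitions, rng):
    """Monte Carlo means of the statistic and of the fraction of empty cells
    when the n observations are drawn from the uniform distribution on k cells."""
    q = [1 / k] * k
    total_stat, total_empty = 0.0, 0.0
    for _ in range(repetitions):
        counts = [0] * k
        for cell in rng.choices(range(k), k=n):   # n i.i.d. uniform cell indices
            counts[cell] += 1
        total_stat += hellinger_statistic(counts, q)
        total_empty += counts.count(0) / k
    return total_stat / repetitions, total_empty / repetitions


rng = random.Random(0)          # fixed seed, chosen arbitrarily
cells = 50
for n in (100, 500, 5000):
    mean_stat, mean_empty = simulate_under_hypothesis(n, cells, 20, rng)
    print(n / cells, mean_stat, mean_empty)
```

In such runs the estimated mean decreases as $n/k$ grows, in line with strong consistency, while empty cells still occur at moderate values of $n/k$.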

Our results are concentrated in Theorem 9 below. Its proof uses the following auxiliary result.



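Lemma 8: For $x, y > 0$ and $1 \le \alpha \le 2$ it holds

$$L_\alpha(x, y) \le \phi_\alpha(y) - \phi_\alpha(x) \le U_\alpha(x, y) \qquad (44)$$

where

$$L_\alpha(x, y) = (y - x)\, \phi'_\alpha(x) \qquad (45)$$

and

$$U_\alpha(x, y) = L_\alpha(x, y) + \frac{1}{\alpha} x^{\alpha - 2} (y - x)^2. \qquad (46)$$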


Proof: First assume $1 < \alpha < 2$. Since $\frac{1}{\alpha} x^{\alpha - 2} (y - x)^2$ is nonnegative, it suffices to prove

$$\phi_\alpha(y) \ge \phi_\alpha(x) + \phi'_\alpha(x)(y - x) \qquad (47)$$

and

$$\phi_\alpha(y) \le \phi_\alpha(x) + \phi'_\alpha(x)(y - x) + \frac{1}{\alpha} x^{\alpha - 2} (y - x)^2. \qquad (48)$$

But Inequality (47) is evident since the function $y \to \phi_\alpha(y)$ is convex. We shall prove that the function

$$f(y) = \phi_\alpha(y) - \left( \phi_\alpha(x) + \phi'_\alpha(x)(y - x) + \frac{1}{\alpha} x^{\alpha - 2} (y - x)^2 \right)$$

is non-positive. First we observe that $f(0) = f(x) = 0$. By differentiating $f(y)$ we get

$$f'(y) = \frac{y^{\alpha - 1} - 1}{\alpha - 1} - \left( \phi'_\alpha(x) + \frac{2}{\alpha} x^{\alpha - 2} (y - x) \right)$$

so that $f'(x) = 0$. Differentiating once more we get

$$f''(y) = y^{\alpha - 2} - \frac{2}{\alpha} x^{\alpha - 2}.$$

Thus $f''(y) \ge 0$ for $y \le x_\alpha \stackrel{def}{=} (\alpha/2)^{1/(2 - \alpha)} x$ and $f''(y) \le 0$ for $y \ge x_\alpha$. Since $x_\alpha < x$ and $f(y)$ is concave on $[x_\alpha, \infty[$, it is maximized on this interval at $y = x$ where $f(x) = 0$. Thus $f(y) \le 0$ on this interval and in particular $f(x_\alpha) \le 0$. This together with $f(0) = 0$ and the convexity of $f(y)$ on the interval $[0, x_\alpha]$ implies $f(y) \le 0$ for $y \in [0, x]$. This completes the proof of the non-positivity of $f(y)$, i.e. the proof of (48). The cases $\alpha = 2$ and $\alpha = 1$ follow by continuity. ■
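The two-sided bound (44) can be spot-checked numerically. The following minimal sketch evaluates both sides on a grid; the grid, the selected orders $1 < \alpha \le 2$ and the rounding tolerance are assumptions of this illustration, and the function names are hypothetical. A grid check of this kind is of course no substitute for the proof.

```python
def phi(t, alpha):
    """Power function phi_alpha(t) for alpha != 1, cf. (7)."""
    return (t ** alpha - alpha * (t - 1) - 1) / (alpha * (alpha - 1))


def phi_prime(t, alpha):
    """Derivative of phi_alpha for alpha != 1."""
    return (t ** (alpha - 1) - 1) / (alpha - 1)


def lower_bound(x, y, alpha):
    """L_alpha(x, y) of (45)."""
    return (y - x) * phi_prime(x, alpha)


def upper_bound(x, y, alpha):
    """U_alpha(x, y) of (46)."""
    return lower_bound(x, y, alpha) + x ** (alpha - 2) * (y - x) ** 2 / alpha


def bounds_hold(x, y, alpha, tol=1e-9):
    """Check (44) up to a small tolerance for floating point rounding."""
    diff = phi(y, alpha) - phi(x, alpha)
    return lower_bound(x, y, alpha) - tol <= diff <= upper_bound(x, y, alpha) + tol


grid = [i / 20 for i in range(0, 101)]            # 0, 0.05, ..., 5
violations = [(x, y, a)
              for a in (1.1, 1.25, 1.5, 1.75, 2.0)
              for x in grid[1:]                    # x > 0
              for y in grid                        # y >= 0
              if not bounds_hold(x, y, a)]
print(len(violations))
```

The upper bound is attained at $y = 0$ and at $y = x$, which is why a small tolerance is used in the comparison.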

The main result of this section is the following theorem.

Theorem 9: Let distributions $Q_n \in M(k)$ satisfy the assumption A(α). Then the statistic $D_\alpha(\hat P_n, Q_n)$ is strongly consistent provided

$$0 < \alpha \le 2 \quad \text{and} \quad \frac{n}{k} \to \infty \qquad (49)$$

and consistent provided

$$\alpha > 2 \quad \text{and} \quad \frac{n}{k \log k} \to \infty. \qquad (50)$$



Proof: We shall use the same notation as in the proof of Theorem 7. In the proof we treat separately the cases

i: $0 < \alpha < 1$,
ii: $1 < \alpha \le 2$,
iii: $\alpha = 1$,
iv: $\alpha > 2$.

Case i ($0 < \alpha < 1$): This follows from Theorem 7 because $x \to \phi_\alpha(x)$ is uniformly continuous.

Case ii ($1 < \alpha \le 2$): Here we get from (42)



3 = 1 



Qj Pa 



(51) 
so that Lemma 8 implies



Next we bound the first term. 



f>£ Q <A Q , n < 



^ V <h IjJ j^y " V'/;/ V'/, </, 



and 



E^ -ViWa 



3 = 1 



We take the mean and get 



k a— 2 N 2 
\ - (Pj ~Pj) 

3 = 1 ^3 



E®> -Pj)<t>' a 



3=1 



< E 



k 



E^ -PiWa 



1/2 



1/2 



V 1i J \ Qj 



i,j=i 



1 



+ 



1/2 



1/2 



■E |A ajn | 



< 



- Pj')0a 



fc a-2 
V- Pj 



U < 1 



(Pj - Pj')' 



The terms on the right hand side are treated separately. 



i=i y J 



(p? - Pi) 2 ] _ * E (»Pj - m f 

a a a ~ x an 2 

_ sp PT 2 npj{l-pj) 

^ q a - x an 2 
3=1 q i 

3 = 1 \k) 
ua-1 *L 

< -=-5=r EiT 1 - 

anp a 1 £ — ' J 

3=1 



This equals 



i( Eti^(i-ft)(^(|)) 

|-E^i^P<Pj^ (t) 



1/2 



< 



i / E-=iP^K(t 



,1/2 



E^P*P>a(f) 



1/2 



This can be bounded as 



,1/2 



+ (E 4 p^(| 



1/2 



, / Pi 

.3i 



1/2 



The function P — > V fe if? 1 is concave so it attains its T , , , , , ■ ■ • . 

L -f3=i'"3 These bounds can be combined into 

maximum for P = (1/fc, 1/fe, ■ • • , 1/fc) . Therefore 



k n a-2 E 

EP? 

3 = 1 q 3 



(Pj -PjY 



< 



a 



■ -^i fc 



anp' 

1 k 

ap a ~ x n 



a-l 



E|A a J < 



1 k 2 



/ / Pi 



1/2 



(52) 



Under (49) the first term tends to zero as $n \to \infty$. The last term does the same, which is seen from the inequalities

Case iv ($\alpha > 2$): By A2,

k 



2 k 

- YV: 
n f— i 

2—1 



/ 2 D a (P,Q) = -^L_ - 1 

V a_1 J ^ i ((k\ a - lk 

a(a-l) \\g 



~~ n 1 (n — 1 \ 2 so that 



^/J 9i (lyLy- +1 ) < Wa - da + d (|)»- < 55 > 



n(a- 1) ^ V*/ \<7 



fc / \ a / i \ a — 1 



where we replaced D a (P, Q) by A = A Q in the sense of (RTt . 
Further, by the Taylor formula 

, \ ' t ' / v / • / " / / / 1(1 — .1. J / -i \ 



< z \^ „. ( ^1 V f 1 ^ " + 2 Further, by the Taylor formula 

- n {a-lf{ri W W*/ n(a-l)~ 



fc"- 1 2 (Pi\ 2 



-2> S 



n (a — l) 2 p Q_1 \9i / n (a — l) 2 where £j is between pj and pj. We shall look for a highly 

UQ _i f ni (n „ probable upper bound on pj. Choose any b > 1 and consider 



= i>: • xp, -Pi)+ - Pj y (56) 



2(a(a-l)A+l) 



(a-lfp - 1 n(a-l) 2 ' 



the random event 

E n j(b) = {pj > b max {pj,qj}}. 



Case Hi (a = I): For a = 1 in Inequality |52] we get We shaU prove that under ft hoMs 

fc / 2 * / p A 2 \ 1/2 MfeJ^PCU^ft))— >0. (57) 

E |Ai „| < ^ + I ^Jp« ( In J j ■ The components X, = X„j of the observation vector X n 

defined in Section 1 are approximately Poisson distributed, 

Using In pi < we find that last term on the right satisfies Po ( n Pj) . so that 

the relations p ^ ^ 6 max {pj, qj}) — P (Xj > nb max {pj, qj}) 

^ k < exp{— D\ (Po (b max {npj, nqj}) , Po (rip^))} 



„ XJ 151 ( ln ' Pl 2 lnK ln * + ln ' (53) for the divergence D X (P, Q) defined by ©-© with P, Q 

replaced by the corresponding Poisson distributions. But 



3=1 



= - Pi In 2 Pi - - Pi In Pi ln % + - V" p. In 2 % 
n r — ' ft . — ' ft z — ' 

3=1 3=1 3=1 

<-f Pl in 2 ft + ^ifw^-y; w in 2 Pl + ^-i. for a11 o < P j , % < i 

J _1 J -1 / 6 max {pj ;, } 



Di (Po (6ngj) , Po (rapj)) = npA (6) (58) 
for the logarithmic function <f>\ > introduced in @. Since 



3 



> </»x (6) > 1 for b > 1, 



The function a; — > a; In 2 a; is concave in the interval [0; e 2 ] V ^ ^ 

and convex in the interval [e" 1 ;!] . Therefore we we can it holds 
apply the method of [16| to verify that ^2i = iPim 2 Pi attains 

its maximum for a mixture of uniform distributions on k points max { n Pj > n 1js) > P° ( n Pj )) 

and on subset of k — 1 of these points. Thus > D\ (Po (6ngj) , Po (nqj)) . 



k , fc 



Consequently, 

-^p i ln 2 p J - < ^^7^37 In 2 f^J 7Tn(&) < P ®» - ^^fe' 9j}) 

i—l i—1 



_ fcln k ^ 21n fc < ^ cxp{— £>i (Po (bmax{npj, ngj}) , Po (rapj))} 
n (fc — 1) — n j 

and we can conclude that under (ggjl the first term in ([53j tends ~ 6Xp ^ - 01 ( Po ( &n,? 3 ) ' Po ) ) I 
to zero as n tends to infinity. Obviously, under ( |49] i also the 

second term in ( BTT i tends to zero so that the desired relation = / 4 ex P {~~ nc Lj4>i Q>)} ( c f- *ESJ) 
holds. J' 



< fcexp|-ri|(TJi (6) | = ^-¥1^^1(6), (59) 



x 



Assumption (50) implies that the exponent in (59) tends to $-\infty$ so that (57) holds. Therefore it suffices to prove (39) under the condition that for all sufficiently large $n$ the random events $\cup_j E_{nj}(b)$ fail to take place, i.e. that

$$\hat p_j \le b \max\{p_j, q_j\} \quad \text{for all } 1 \le j \le k. \qquad (60)$$

Let us start with the fact that under d60l > it holds £j < 

{bpj,bqj} and then 



-2 ^ ia-2 n-2 , m-2 Q 



£"- 2 < (max {bpjMj}) a ~* < b a -' 2 P «- 2 +b° 



k a-2- 



(61) 



Applying this in the Taylor formula ( I56l l we obtain 



I a\ ^ a — 1 i 

\Pj ~Pj I < a Pj \Pj -Pj\ 

a(a - l)b a - 2 



pT 2 + €^M%-Pi) 2 



k 



Hence under ( f6Qb we get from (f5TJ and Lemma [8] 



However, by Schwarz inequality and ((55), 



E^~ 1 = E^ (p5 

3=1 3=1 



a _n(a-2)/(a-l) 



(a-2)/(a-l) / \ (a-2)/(a-l) 



'E/v<; 1 

v3'=l 



E*>i 
3=1 

Q _1 X (a-2)/(a-l) 



so that the validity of (39) under (50) is obvious and the proof is complete. ■

Condition (50) is stronger than Condition (38) and implies that for any fixed number $a > 0$ eventually every bin will contain more than $a$ observations.



k ~ 



A:"- 1 a(a - l)6 a " 2 

(a - 1) ^ 2 

v ; .7=1 



^- 9 + ^)'Ai-yv 2 



Applying (55) and using Jensen's inequality and the expectation bound (43), we upper bound $E|\Delta_{\alpha,n}|$ by



(a(a-l)A + l) 1/2 (EPj~ V 



1/2 



a (a — 1) 



m-27,a!-l , - 

E^r 2 +|=j)E[(Pi-ft) 2 ] 



nPL-2 



< 



(a(a-l)A + l) 1/2 /EP Q_1 



1/2 



a (a — 1) \ k 1 a n I 

2 I ^3 f, a -2 I n 



(a(a-l)A + l) 1/2 /Ep Q_1 



1/2 



a (a — 1) \ /c 1 Q n 

+ 



(62) 



b a-2 k a ~ l Y!; =l p^- X b a - 2 g a - 2 k 



2 n 

Obviously, under (60) the desired relation (39) holds if the assumption (50) implies the convergence



a-l 



->• 0. 



(63) 



IV. Bahadur efficiency 

In this section we study the Bahadur efficiency in the class of power divergence statistics $\hat D_{\alpha,n} = D_\alpha(\hat P_n, Q_n)$, $\alpha > 0$. As before, we use the simplified notations

$$P_n = P, \quad Q_n = Q \quad \text{and} \quad k_n = k.$$

The results are concentrated in Theorem 13 below. Its proof is based on the following lemmas. The first two of them make use of the Rényi divergences of orders $\alpha > 0$

$$D_\alpha(P\|Q) = \frac{1}{\alpha - 1} \ln \sum_{j=1}^k p_j^\alpha q_j^{1-\alpha}, \quad \alpha \neq 1,$$

$$D_1(P\|Q) = \lim_{\alpha \to 1} D_\alpha(P\|Q) = D(P\|Q)$$

where $D(P\|Q)$ is the classical information divergence denoted above by $D_1(P, Q)$. There is a monotone relationship between the Rényi and power divergences given by the formula

$$D_\alpha(P\|Q) = \frac{1}{\alpha - 1} \ln\left( 1 + \alpha(\alpha - 1) D_\alpha(P, Q) \right), \quad \alpha \neq 1, \qquad (64)$$

$$D_1(P\|Q) = D_1(P, Q). \qquad (65)$$



Lemma 10: Let $P$ and $Q$ be probability vectors on the set $\mathcal{X}$. If $\alpha < \beta$ then

$$D_\alpha(P\|Q) \le D_\beta(P\|Q)$$

with equality if and only if there exists a subset $A \subseteq \mathcal{X}$ such that $P = Q(\cdot \mid A)$.
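Proof: By Jensen's inequality

$$D_\alpha(P\|Q) = \frac{1}{\alpha - 1} \ln \sum_{j=1}^k p_j \left( \left( \frac{p_j}{q_j} \right)^{\beta - 1} \right)^{\frac{\alpha - 1}{\beta - 1}} \le \frac{1}{\alpha - 1} \ln \left( \sum_{j=1}^k p_j \left( \frac{p_j}{q_j} \right)^{\beta - 1} \right)^{\frac{\alpha - 1}{\beta - 1}} = D_\beta(P\|Q).$$

The equality takes place if and only if $p_j/q_j$ is constant $P$-almost surely. Therefore $p_j/q_j$ is constant on the support of $P$ that we shall denote $A$. Now $P$ equals $Q$ conditioned on $A$. ■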
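Both the relationship (64) and the monotonicity of the Rényi divergence in its order are easy to check numerically. The sketch below is a minimal illustration; the two distributions, the list of orders and the function names are assumptions of this example.

```python
import math


def power_divergence(p, q, alpha):
    """Power divergence D_alpha(P, Q); alpha = 1 is the information divergence."""
    if alpha == 1:
        return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)
    s = sum(pj ** alpha * qj ** (1 - alpha) for pj, qj in zip(p, q))
    return (s - 1) / (alpha * (alpha - 1))


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(P||Q) computed directly from its definition."""
    if alpha == 1:
        return power_divergence(p, q, 1)
    s = sum(pj ** alpha * qj ** (1 - alpha) for pj, qj in zip(p, q))
    return math.log(s) / (alpha - 1)


def renyi_from_power(d_alpha, alpha):
    """Right-hand side of (64), (65): maps D_alpha(P, Q) to D_alpha(P||Q)."""
    if alpha == 1:
        return d_alpha
    return math.log(1 + alpha * (alpha - 1) * d_alpha) / (alpha - 1)


p = [0.5, 0.3, 0.2]     # hypothetical distributions on three points
q = [0.2, 0.3, 0.5]
orders = [0.25, 0.5, 0.75, 1, 1.5, 2, 3]
values = [renyi_divergence(p, q, a) for a in orders]
for a, v in zip(orders, values):
    print(a, v, renyi_from_power(power_divergence(p, q, a), a))
```

The middle column is nondecreasing in the order, as Lemma 10 states, and the last column reproduces it through (64).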



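Lemma 11: Let $0 < \alpha \le 1$. If

$$\frac{n}{k \ln n} \to \infty \qquad (66)$$

and $q_{\max} \to 0$ as $n \to \infty$ then the statistic $\hat D_{\alpha,n}$ is Bahadur stable and consistent and the constant sequence $c_\alpha(n) = 1$ generates the Bahadur function

$$g_\alpha(\Delta) = \begin{cases} \dfrac{\ln(1 + \alpha(\alpha - 1)\Delta)}{\alpha - 1}, & \Delta > 0 \text{ when } 0 < \alpha < 1, \\[2ex] \lim_{\alpha \to 1} g_\alpha(\Delta) = \Delta, & \Delta > 0 \text{ when } \alpha = 1. \end{cases} \qquad (67)$$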



Proof: Let us first consider $0 < \alpha < 1$. The minimum of $D_1(P, Q)$ given $D_\alpha(P\|Q) \ge \Delta$ is lower bounded by $\Delta$. Let $\varepsilon > 0$ be given. If $q_{\max}$ is sufficiently small there exist sets $A_- \subseteq A_+$ such that

$$-\ln Q(A_+) \le \Delta \le -\ln Q(A_-) \le \Delta + \varepsilon.$$

Let $P_s$ denote the mixture $(1 - s)\, Q(\cdot \mid A_+) + s\, Q(\cdot \mid A_-)$. Then $s \to D_\alpha(P_s\|Q)$ is a continuous function satisfying

$$D_\alpha(P_0\|Q) \le \Delta, \quad D_\alpha(P_1\|Q) \ge \Delta.$$

In particular there exists $s \in [0, 1]$ such that $D_\alpha(P_s\|Q) = \Delta$. For this $s$ we have

$$D_1(P_s, Q) \le (1 - s)\, D_1(Q(\cdot \mid A_+), Q) + s\, D_1(Q(\cdot \mid A_-), Q) = (1 - s)(-\ln Q(A_+)) + s(-\ln Q(A_-)) \le (1 - s)\Delta + s(\Delta + \varepsilon) \le \Delta + \varepsilon.$$

Hence

$$\Delta \le \inf D_1(P, Q) \le \Delta + \varepsilon$$

where the infimum is taken over all $P$ satisfying $D_\alpha(P\|Q) = \Delta$ and where $n$ is sufficiently large. This holds for all $\varepsilon > 0$ so the Bahadur function of the statistic $D_\alpha(\hat P\|Q)$ is $g(\Delta) = \Delta$. The Bahadur function of the power divergence statistics $D_\alpha(\hat P, Q)$ can be calculated using Equality (64). ■
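The proof relies on two elementary facts: a conditional distribution $Q(\cdot \mid A)$ has Rényi divergence $-\ln Q(A)$ from $Q$ for every order, and the information divergence is convex along the mixture $P_s$. The sketch below checks both numerically for a hypothetical four-point distribution; the sets, which are given as zero-based index sets, and all function names are assumptions of this illustration.

```python
import math


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(P||Q); alpha = 1 is the information divergence."""
    if alpha == 1:
        return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)
    s = sum(pj ** alpha * qj ** (1 - alpha) for pj, qj in zip(p, q))
    return math.log(s) / (alpha - 1)


def conditional(q, cells):
    """Q(. | A) for a set A of (zero-based) cell indices."""
    mass = sum(q[j] for j in cells)
    return [q[j] / mass if j in cells else 0.0 for j in range(len(q))]


def mixture(s, p_plus, p_minus):
    """P_s = (1 - s) Q(. | A+) + s Q(. | A-)."""
    return [(1 - s) * a + s * b for a, b in zip(p_plus, p_minus)]


q = [0.1, 0.2, 0.3, 0.4]                  # hypothetical Q
a_plus, a_minus = {0, 1, 2}, {0, 1}       # hypothetical nested sets
p_plus, p_minus = conditional(q, a_plus), conditional(q, a_minus)
level_plus = -math.log(sum(q[j] for j in a_plus))
level_minus = -math.log(sum(q[j] for j in a_minus))

for alpha in (0.25, 0.5, 1, 2):
    print(alpha, renyi_divergence(p_plus, q, alpha), level_plus)

for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    bound = (1 - s) * level_plus + s * level_minus
    print(s, renyi_divergence(mixture(s, p_plus, p_minus), q, 1), bound)
```

In the second loop the information divergence of $P_s$ never exceeds the convex combination of the two levels, which is the inequality used in the proof.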



Lemma 12: Let $\alpha > 1$. If assumption A(α) holds for the uniform distributions $Q_n = U$ and the sequence

$$c_\alpha(n) = \frac{k^{(\alpha - 1)/\alpha}}{\ln k} \qquad (68)$$

satisfies the condition

$$\frac{n}{c_\alpha(n)\, k \ln n} \to \infty \qquad (69)$$

then the statistic $\hat D_{\alpha,n} = D_\alpha(\hat P_n, Q_n)$ is consistent and the sequence (68) generates the Bahadur function

$$g_\alpha(\Delta) = (\alpha(\alpha - 1)\Delta)^{1/\alpha}, \quad \Delta > 0. \qquad (70)$$

Proof: If the sequence (68) satisfies (69) then Theorem 1 implies the consistency of $\hat D_{\alpha,n}$. Formula (70) was already mentioned in Example 2 above with a reference to Harremoës and Vajda [5]. ■

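Theorem 13: Let the assumption A($\alpha_1, \alpha_2$) hold where $0 < \alpha_1 < \alpha_2$. If

$$\frac{n}{k \ln n} \to \infty \qquad (71)$$

then the statistics

$$\hat D_{\alpha_1,n} = D_{\alpha_1}(\hat P_n, Q_n), \quad \hat D_{\alpha_2,n} = D_{\alpha_2}(\hat P_n, Q_n)$$

satisfy the relation

$$BE(\hat D_{\alpha_1,n}; \hat D_{\alpha_2,n}) = \begin{cases} \dfrac{\alpha_2 - 1}{\alpha_1 - 1} \cdot \dfrac{\ln(1 + \alpha_1(\alpha_1 - 1)\Delta_{\alpha_1})}{\ln(1 + \alpha_2(\alpha_2 - 1)\Delta_{\alpha_2})} & \text{for } \alpha_2 < 1, \\[2ex] \dfrac{\ln(1 + \alpha_1(\alpha_1 - 1)\Delta_{\alpha_1})}{(\alpha_1 - 1)\, \Delta_1} & \text{for } \alpha_2 = 1. \end{cases} \qquad (72)$$

If

$$\frac{n \ln k}{k^{2 - 1/\alpha_2} \ln n} \to \infty \qquad (73)$$

then the statistics $\hat D_{\alpha_1,n} = D_{\alpha_1}(\hat P_n, U)$ and $\hat D_{\alpha_2,n} = D_{\alpha_2}(\hat P_n, U)$ satisfy the relation

$$BE(\hat D_{\alpha_1,n}; \hat D_{\alpha_2,n}) = \infty \quad \text{for } \alpha_2 > 1. \qquad (74)$$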
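For orders $0 < \alpha_1 < \alpha_2 \le 1$ the efficiency (72) is simply a ratio of the Bahadur functions (67), because both are generated by the constant sequence. The following minimal sketch evaluates this ratio, assuming the limits $\Delta_\alpha$ are known. The function names are hypothetical, and the limits used in the example are those of the half-support alternative of Example 14, namely $(2^{\alpha - 1} - 1)/(\alpha(\alpha - 1))$ for $\alpha \neq 1$ and $\ln 2$ for $\alpha = 1$.

```python
import math


def bahadur_function(delta, alpha):
    """g_alpha(delta) of (67), valid for 0 < alpha <= 1."""
    if alpha == 1:
        return delta
    return math.log(1 + alpha * (alpha - 1) * delta) / (alpha - 1)


def bahadur_efficiency(delta_1, alpha_1, delta_2, alpha_2):
    """Ratio g_{alpha_1}(delta_1) / g_{alpha_2}(delta_2), cf. (72),
    for 0 < alpha_1 < alpha_2 <= 1 (both generated by the constant sequence)."""
    return bahadur_function(delta_1, alpha_1) / bahadur_function(delta_2, alpha_2)


def half_support_limit(alpha):
    """Limit Delta_alpha for an alternative that is uniform on half of the cells
    while the hypothesis is uniform on all cells (the setting of Example 14)."""
    if alpha == 1:
        return math.log(2)
    return (2 ** (alpha - 1) - 1) / (alpha * (alpha - 1))


for alpha_1 in (0.25, 0.5, 0.75):
    print(alpha_1, bahadur_efficiency(half_support_limit(alpha_1), alpha_1,
                                      half_support_limit(1), 1))
```

For this particular alternative every ratio equals 1 up to rounding, since $g_\alpha(\Delta_\alpha) = \ln 2$ for all $0 < \alpha \le 1$.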



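Proof: By Lemma 11 the assumptions of Definition 5 hold. The first assertion follows directly from Definition 3 since, by Lemma 11,

$$\frac{g_{\alpha_1}(\Delta_{\alpha_1})}{g_{\alpha_2}(\Delta_{\alpha_2})} = \begin{cases} \dfrac{\alpha_2 - 1}{\alpha_1 - 1} \cdot \dfrac{\ln(1 + \alpha_1(\alpha_1 - 1)\Delta_{\alpha_1})}{\ln(1 + \alpha_2(\alpha_2 - 1)\Delta_{\alpha_2})} & \text{when } \alpha_2 < 1, \\[2ex] \dfrac{\ln(1 + \alpha_1(\alpha_1 - 1)\Delta_{\alpha_1})}{(\alpha_1 - 1)\, \Delta_1} & \text{when } \alpha_2 = 1. \end{cases} \qquad (75)$$

The second assertion was for $\alpha_1 = 1$ deduced in Section 2 from the lemmas presented there. The argument was based on the fact that $c_{\alpha_1}(n) = 1$ for $\alpha_1 = 1$. But $c_\alpha(n) = 1$ for all $0 < \alpha \le 1$ so that the extension from $\alpha_1 = 1$ to $0 < \alpha_1 < 1$ is straightforward. ■

Example 14: Let

$$P_n = \left( p_{nj} \stackrel{def}{=} \frac{1_{\{1 \le j \le k/2\}}}{\lfloor k/2 \rfloor} : 1 \le j \le k \right), \quad n = 1, 2, \ldots \qquad (76)$$

where $1_{\{\cdot\}}$ is the indicator function, $\lfloor \cdot \rfloor$ stands for the integer part (floor function) and, as before,

$$U = (u_j \stackrel{def}{=} 1/k : 1 \le j \le k).$$

Then for $\alpha \neq 0, 1$

$$D_\alpha(P_n, U) = \frac{\sum_j u_j \left( (p_{nj}/u_j)^\alpha - \alpha(p_{nj}/u_j - 1) - 1 \right)}{\alpha(\alpha - 1)} = \frac{\sum_j p_{nj}^\alpha u_j^{1-\alpha} - \alpha \sum_j (p_{nj} - u_j) - \sum_j u_j}{\alpha(\alpha - 1)}$$

$$= \frac{k^{\alpha - 1} \sum_{j=1}^{\lfloor k/2 \rfloor} \lfloor k/2 \rfloor^{-\alpha} - 1}{\alpha(\alpha - 1)} = \frac{k^{\alpha - 1} \lfloor k/2 \rfloor / \lfloor k/2 \rfloor^{\alpha} - 1}{\alpha(\alpha - 1)} = \frac{(k/\lfloor k/2 \rfloor)^{\alpha - 1} - 1}{\alpha(\alpha - 1)}.$$

Therefore the identifiability condition (19) takes on the form

$$D_\alpha(P_n, U) \to \Delta_\alpha \stackrel{def}{=} \begin{cases} \dfrac{2^{\alpha - 1} - 1}{\alpha(\alpha - 1)} & \text{if } \alpha > 0,\ \alpha \neq 1, \\[2ex] \ln 2 & \text{if } \alpha = 1. \end{cases}$$

If $0 < \alpha \le 1$ then Lemma 11 implies

$$g_\alpha(\Delta) = \ln(1 + \alpha(\alpha - 1)\Delta)/(\alpha - 1)$$

when $0 < \alpha < 1$ and $g_1(\Delta) = \Delta$ when $\alpha = 1$. If moreover (72) holds then under the alternative (76)

$$\frac{g_\alpha(\Delta_\alpha)}{g_1(\Delta_1)} = \frac{\ln\left( 1 + \alpha(\alpha - 1) \frac{2^{\alpha - 1} - 1}{\alpha(\alpha - 1)} \right)}{(\alpha - 1) \ln 2} = \frac{\ln(1 + 2^{\alpha - 1} - 1)}{(\alpha - 1) \ln 2} = 1.$$

Hence, by Definition 4, the likelihood ratio statistic $\hat D_{1,n}$ is as Bahadur efficient as any $\hat D_{\alpha,n}$ with $0 < \alpha < 1$. If $\alpha > 1$ then Lemma 12 implies

$$\frac{g_\alpha(\Delta_\alpha)}{g_1(\Delta_1)} = \frac{g_\alpha(\Delta_\alpha)}{\ln 2} > 1.$$

However, contrary to this prevalence of $g_\alpha(\Delta_\alpha)$ over $g_1(\Delta_1)$, Theorem 13 implies that $\hat D_{1,n}$ is infinitely more Bahadur efficient than $\hat D_{\alpha,n}$.

Example 15: Let us now consider the truncated geometric distribution

$$P_n = (p_{n1}, \ldots, p_{nk}) = c_k(p)\, (1, p, \ldots, p^k)$$

with parameter $p = p_n \in\, ]0, 1[$. Since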



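\[
\frac{1}{1-p} = 1 + p + p^2 + \cdots
\qquad\text{and}\qquad
\frac{p^{k+1}}{1-p} = p^{k+1} + p^{k+2} + \cdots
\]

it holds

\[
1 + p + \cdots + p^k = \frac{1}{1-p} - \frac{p^{k+1}}{1-p} = \frac{1 - p^{k+1}}{1-p} = \frac{1}{c_k(p)} .
\]

Hence for all $\alpha \ne 0, 1$

\begin{align*}
\sum_j u_j \left( \frac{p_{nj}}{u_j} \right)^{\alpha}
&= \frac{1}{k} \sum_{j=0}^{k} \frac{k^\alpha (1-p)^\alpha p^{\alpha j}}{\left(1 - p^{k+1}\right)^\alpha} \\
&= \frac{k^\alpha (1-p)^\alpha}{k \left(1 - p^{k+1}\right)^\alpha} \cdot \frac{1 - p^{\alpha(k+1)}}{1 - p^\alpha} \\
&= \frac{\left(k(1-p)\right)^\alpha}{k\,(1 - p^\alpha)} \cdot \frac{1 - p^{\alpha(k+1)}}{\left(1 - p^{k+1}\right)^\alpha} .
\end{align*}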
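In the particular case $p = 1 - x/k$ for $x > 0$ fixed we get $k(1-p) = x$ and

\[
p^{\alpha(k+1)} = \left(1 - \frac{x}{k}\right)^{\alpha(k+1)} \to e^{-\alpha x},
\qquad
p^{k+1} = \left(1 - \frac{x}{k}\right)^{k+1} \to e^{-x}.
\]

Therefore

\begin{align*}
\alpha(\alpha-1)\,D_{\alpha,n} + 1
&= \frac{x^\alpha}{k\left(\frac{\alpha x}{k} + o\!\left(\frac{x}{k}\right)\right)} \cdot \frac{1 - e^{-\alpha x}}{\left(1 - e^{-x}\right)^\alpha}\,(1 + o(1)) \\
&= \frac{x^\alpha}{\alpha x + o(x)} \cdot \frac{e^{x\alpha} - 1}{\left(e^x - 1\right)^\alpha}\,(1 + o(1)).
\end{align*}

Consequently,

\[
\alpha(\alpha-1)\,\Delta_\alpha + 1 = \frac{x^{\alpha-1}\left(e^{x\alpha} - 1\right)}{\alpha \left(e^x - 1\right)^\alpha},
\]

i.e.,

\[
\Delta_\alpha = \frac{x^{\alpha-1}\left(e^{x\alpha} - 1\right) - \alpha\left(e^x - 1\right)^\alpha}{\alpha^2(\alpha-1)\left(e^x - 1\right)^\alpha}
\qquad \text{for } \alpha \ne 0, 1.
\]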






By the L' Hospital rule, 
Ai = In 



+ 



A 



e{e x — 1) e x — 1 
ln(e x — 1) — In x 



From here one can deduce that if x — > then 



A Q 



for all a € 



If x 



and 



1 then 

Aq, 

Ai 
A 



1 



a 2 {a- l).(e- l) a 
1 - (e - 1) ln(e - 1) 



e - 1 



for a ^ 0, 1 



= 0.035, 



ln(e - 1) 



= 0.271. 



Using Lemma 2 and Theorem 2 in a similar manner as in the previous example, we find that here $D_{1,n}$ is more Bahadur efficient than any $D_{\alpha,n}$ with $0 < \alpha < \infty$, $\alpha \ne 1$.
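The closed forms in Examples 14 and 15 can be checked numerically. The following is a minimal sketch, not part of the original derivation. All function names are hypothetical. The truncated geometric distribution is taken on $k$ cells with weights $1, p, \ldots, p^{k-1}$, an assumption that does not change the limit for $p = 1 - x/k$.

```python
import math

def power_divergence(p, q, alpha):
    """D_alpha(P, Q) as in Example 14; alpha = 1 is the information divergence."""
    if alpha == 1:
        return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)
    s = sum(qj * ((pj / qj) ** alpha - alpha * (pj / qj - 1) - 1)
            for pj, qj in zip(p, q))
    return s / (alpha * (alpha - 1))

def uniform(k):
    return [1.0 / k] * k

def half_uniform(k):
    """Alternative (76): uniform on the first floor(k/2) cells."""
    m = k // 2
    return [1.0 / m] * m + [0.0] * (k - m)

def delta_ex14(alpha):
    """Limit of D_alpha(P_n, U) in Example 14."""
    if alpha == 1:
        return math.log(2)
    return (2 ** (alpha - 1) - 1) / (alpha * (alpha - 1))

def g(alpha, delta):
    """Bahadur function quoted in Example 14 for 0 < alpha <= 1."""
    if alpha == 1:
        return delta
    return math.log(1 + alpha * (alpha - 1) * delta) / (alpha - 1)

def truncated_geometric(k, p):
    """Assumption: k cells with weights 1, p, ..., p**(k-1), normalised."""
    w = [p ** j for j in range(k)]
    total = sum(w)
    return [wj / total for wj in w]

def delta_ex15(alpha, x):
    """Limit Delta_alpha of Example 15 for p = 1 - x/k, alpha not in {0, 1}."""
    e = math.exp(x) - 1
    num = x ** (alpha - 1) * (math.exp(alpha * x) - 1) - alpha * e ** alpha
    return num / (alpha ** 2 * (alpha - 1) * e ** alpha)

if __name__ == "__main__":
    k = 20000
    for alpha in (0.5, 1, 2.0):
        d14 = power_divergence(half_uniform(k), uniform(k), alpha)
        print("Example 14", alpha, d14, delta_ex14(alpha))
    for alpha in (0.5, 2.0):
        p_n = truncated_geometric(k, 1 - 1.0 / k)
        d15 = power_divergence(p_n, uniform(k), alpha)
        print("Example 15", alpha, d15, delta_ex15(alpha, 1.0))
```

For even $k$ the Example 14 divergence coincides with $\Delta_\alpha$ exactly, and `g(alpha, delta_ex14(alpha))` returns $\ln 2$ for every $0 < \alpha < 1$, which is the ratio-one statement of the example. In Example 15 the agreement is only asymptotic in $k$.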

V. Contiguity 

In this paper we proved that the statistics $D_{\alpha,n}$ of orders $\alpha > 1$ are less Bahadur efficient than those of the orders $0 < \alpha \le 1$ and that the latter are mutually comparable in the Bahadur sense. One might have expected $D_{1,n}$ to be much more Bahadur efficient than $D_{\alpha,n}$ for $0 < \alpha < 1$. To understand why this is not the case we have to examine the assumptions of our theory somewhat more closely.

Recall that given a sequence of pairs of probability measures $(P_n, Q_n)_{n \in \mathbb{N}}$, $(P_n)_{n \in \mathbb{N}}$ is said to be contiguous with respect to $(Q_n)_{n \in \mathbb{N}}$ if $Q_n(A_n) \to 0$ for $n \to \infty$ implies $P_n(A_n) \to 0$ for $n \to \infty$ and any sequence of sets $(A_n)_{n \in \mathbb{N}}$. When $(P_n)_{n \in \mathbb{N}}$ is contiguous with respect to $(Q_n)_{n \in \mathbb{N}}$ we write $P_n \triangleleft Q_n$. Let $P$ and $Q$ be probability measures on the same set $\mathcal{X}$ and let $(\mathcal{F}_n)_{n \in \mathbb{N}}$ be an increasing sequence of finite sub-$\sigma$-algebras on $\mathcal{X}$ that generates the full $\sigma$-algebra on $\mathcal{X}$. If $P_n = P|_{\mathcal{F}_n}$ and $Q_n = Q|_{\mathcal{F}_n}$ then $P_n \triangleleft Q_n$ if and only if $P \ll Q$ where $\ll$ denotes absolute continuity. For completeness we give the proof of the following simple proposition.

Proposition 16: Let $(P_n, Q_n)_{n \in \mathbb{N}}$ denote a sequence of pairs of probability measures and assume that the sequence $D_1(P_n, Q_n)$ is bounded. Then $P_n \triangleleft Q_n$.

Proof: Assume that the proposition is false. Then there exist $\varepsilon > 0$ and a subsequence of sets $(A_{n_k})_{k \in \mathbb{N}}$ such that $Q_{n_k}(A_{n_k}) \to 0$ for $k \to \infty$ and $P_{n_k}(A_{n_k}) > \varepsilon$ for all $k \in \mathbb{N}$.

■
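One standard way to reach the contradiction in the proof of Proposition 16 is sketched here; it is not necessarily the authors' own argument. Restrict both measures to the two-cell partition $\{A_{n_k}, A_{n_k}^c\}$ and use the fact that $D_1$ does not increase under such coarsening. The shorthand $p_k = P_{n_k}(A_{n_k})$ and $q_k = Q_{n_k}(A_{n_k})$ is introduced for this sketch only.

```latex
\begin{align*}
D_1(P_{n_k}, Q_{n_k})
&\ge p_k \ln\frac{p_k}{q_k} + (1 - p_k)\ln\frac{1 - p_k}{1 - q_k} \\
&\ge p_k \ln\frac{1}{q_k} + p_k \ln p_k + (1 - p_k)\ln(1 - p_k)
   && \text{since } -(1 - p_k)\ln(1 - q_k) \ge 0 \\
&\ge \varepsilon \ln\frac{1}{q_k} - \ln 2
   && \text{since } p_k > \varepsilon \text{ and the binary entropy is at most } \ln 2 \\
&\longrightarrow \infty \quad \text{as } q_k \to 0 .
\end{align*}
```

This would contradict the assumed boundedness of $D_1(P_n, Q_n)$.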

In general, a large power $\alpha$ makes the power divergence $D_\alpha(P,Q)$ sensitive to large values of $dP/dQ$. Therefore the statistics $D_{\alpha,n}$ with large $\alpha$ should be used when the sequence of alternatives $P_n$ may not be contiguous with respect to the sequence of hypotheses $Q_n$. Conversely, a small power $\alpha$ makes $D_\alpha(P,Q)$ sensitive to small values of $dP/dQ$. Therefore $D_{\alpha,n}$ with small $\alpha$ should be used when the sequence of hypotheses $Q_n$ is not contiguous with respect to the sequence of alternatives $P_n$. Our conditions guarantee $P_n \triangleleft Q_n$ but not the reversed contiguity $Q_n \triangleleft P_n$. We see that a substantial modification of the conditions is needed in order to guarantee that $D_{1,n}$ dominates the divergence statistics $D_{\alpha,n}$ of the orders $0 < \alpha < 1$ in the Bahadur sense.
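The sensitivity of $D_\alpha(P,Q)$ to large values of $dP/dQ$ can be illustrated on a two-point space. The sketch below is an illustration only. The distributions and the helper `two_point_demo` are hypothetical choices, not taken from the paper.

```python
import math

def power_divergence(p, q, alpha):
    """Power divergence D_alpha(P, Q); alpha = 1 gives the information divergence."""
    if alpha == 1:
        return sum(pj * math.log(pj / qj) for pj, qj in zip(p, q) if pj > 0)
    s = sum(qj * ((pj / qj) ** alpha - alpha * (pj / qj - 1) - 1)
            for pj, qj in zip(p, q))
    return s / (alpha * (alpha - 1))

def two_point_demo(eps, alpha):
    """Return D_alpha for a pair with a large ratio dP/dQ and for a pair with a small one."""
    # dP/dQ = 0.5 / eps on the second cell: a very large likelihood ratio.
    large_ratio = power_divergence([0.5, 0.5], [1 - eps, eps], alpha)
    # dP/dQ = eps / 0.5 on the second cell: a very small likelihood ratio.
    small_ratio = power_divergence([1 - eps, eps], [0.5, 0.5], alpha)
    return large_ratio, small_ratio

if __name__ == "__main__":
    for eps in (1e-2, 1e-4, 1e-6):
        for alpha in (0.5, 1, 2.0):
            print(eps, alpha, two_point_demo(eps, alpha))
```

As `eps` decreases, the order-2 divergence grows without bound only in the large-ratio configuration, while the order-1/2 divergence stays below $1/(\alpha(1-\alpha)) = 4$ in both.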

VI. Appendix: Relations to previous results 

As mentioned at the end of Section II, Harremoes and Vajda assumed the same strong consistency as in Definition 4 but introduced the Bahadur efficiency by the formula (36). The next four lemmas help to clarify the relation between this and the present more precise concept of Bahadur efficiency (35).

Under the assumptions of Definition 4, [5] considered the following conditions.

C1: The limit $c_{\alpha_2/\alpha_1}$ considered in (37) exists.
C2: Both statistics $D_{\alpha_i,n}$ are strongly consistent and both functions $g_{\alpha_i}$ are strongly Bahadur.

Lemma 17: Let the assumptions of Definition 5 hold. Under C1 the Bahadur efficiency (36) coincides with the present Bahadur efficiency (35). If moreover C2 holds then (36) is the Bahadur efficiency in the strong sense.

Proof: The first assertion is clear from (36) and (35). Under C2 the assumptions of Definition 3 hold. Hence the second assertion follows from Definition 6. ■

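Lemma 18: Let the assumptions of Definition 3 hold and let $b(\alpha) : I \to\, ]0,1[$ be an increasing and $d_\alpha : I \to\, ]0,\infty[$ an arbitrary function on an interval $I$ covering $\{\alpha_1, \alpha_2\}$. If for every $\alpha \in \{\alpha_1, \alpha_2\}$ the sequence $c_\alpha(n)$ generating the Bahadur function $g_\alpha$ satisfies the asymptotic condition

\[
c_\alpha(n) = n^{b(\alpha)}\left(d_\alpha + o(1)\right)
\tag{77}
\]

then (37) holds for $c_{\alpha_2/\alpha_1} = \infty$ and condition C1 is satisfied.

Proof: Under (77) it suffices to prove that (37) holds for $c_{\alpha_2/\alpha_1} = \infty$, i.e.

\[
\lim_{n \to \infty} \frac{c_{\alpha_2}(m_n)}{c_{\alpha_1}(n)} = \infty
\tag{78}
\]

for $m_n$ defined by (29). By (77),

\[
c_{\alpha_2}(m_n) = m_n^{b(\alpha_2)}\left(d_{\alpha_2} + o(1)\right)
\qquad\text{and}\qquad
c_{\alpha_1}(n) = n^{b(\alpha_1)}\left(d_{\alpha_1} + o(1)\right)
\]

so that (29) implies

\[
m_n^{1 - b(\alpha_2)} = n^{1 - b(\alpha_1)}\left(\gamma\delta + o(1)\right)
\]

for the finite positive constants

\[
\delta = \frac{d_{\alpha_2}}{d_{\alpha_1}}
\qquad\text{and}\qquad
\gamma = \frac{g_{\alpha_1}(\Delta_{\alpha_1})}{g_{\alpha_2}(\Delta_{\alpha_2})} .
\]

Hence (30) implies

\begin{align*}
\frac{c_{\alpha_2}(m_n)}{c_{\alpha_1}(n)}
&= \frac{m_n}{n}\left(\gamma^{-1} + o(1)\right) \\
&= \frac{n^{\frac{1 - b(\alpha_1)}{1 - b(\alpha_2)}}}{n}\left((\gamma\delta)^{\frac{1}{1 - b(\alpha_2)}}\,\gamma^{-1} + o(1)\right) \\
&= n^{\frac{b(\alpha_2) - b(\alpha_1)}{1 - b(\alpha_2)}}\left(\gamma^{\frac{b(\alpha_2)}{1 - b(\alpha_2)}}\,\delta^{\frac{1}{1 - b(\alpha_2)}} + o(1)\right)
\end{align*}

so that (78) holds. ■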
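The growth rate obtained in the proof of Lemma 18 can be checked numerically. The sketch below is illustrative only and rests on two assumptions: the $o(1)$ terms in (77) are dropped, and (29) is read, in line with the form quoted in (82), as $m_n / c_{\alpha_2}(m_n) = \gamma\, n / c_{\alpha_1}(n)$. The function names and parameter values are hypothetical.

```python
import math

def m_from_n(n, b1, b2, gamma=1.0, d1=1.0, d2=1.0):
    """Solve m / c2(m) = gamma * n / c1(n) for m, with c_i(t) = d_i * t**b_i exactly."""
    delta = d2 / d1
    return (gamma * delta * n ** (1 - b1)) ** (1 / (1 - b2))

def ratio(n, b1, b2, gamma=1.0, d1=1.0, d2=1.0):
    """c2(m_n) / c1(n), the quantity whose divergence is claimed in (78)."""
    m = m_from_n(n, b1, b2, gamma, d1, d2)
    return (d2 * m ** b2) / (d1 * n ** b1)

def predicted(n, b1, b2, gamma=1.0, d1=1.0, d2=1.0):
    """Leading term from the last display of the proof of Lemma 18."""
    delta = d2 / d1
    exponent = (b2 - b1) / (1 - b2)
    return n ** exponent * gamma ** (b2 / (1 - b2)) * delta ** (1 / (1 - b2))

if __name__ == "__main__":
    b1, b2 = 0.25, 0.5  # hypothetical values with b1 < b2 < 1
    for n in (10, 10**3, 10**6):
        print(n, ratio(n, b1, b2, gamma=0.7, d2=2.0),
              predicted(n, b1, b2, gamma=0.7, d2=2.0))
```

With $b(\alpha_1) < b(\alpha_2)$ the exponent $(b(\alpha_2) - b(\alpha_1))/(1 - b(\alpha_2))$ is positive, so the ratio increases with $n$ as a power of $n$.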

Lemma 19: Let the assumptions of Definition 5 hold and let for every $\alpha \in \{\alpha_1, \alpha_2\}$ the sequence $c_\alpha(n)$ generating the Bahadur function $g_\alpha$ satisfy the asymptotic condition

\[
c_\alpha(n) = \frac{\alpha\, n^{b(\alpha)}}{\ln n}
\tag{79}
\]

for some increasing function $b(\alpha) : I \to\, ]0,1[$ on an interval $I$ covering $\{\alpha_1, \alpha_2\}$. Then (37) holds for $c_{\alpha_2/\alpha_1} = \infty$ and condition C1 is satisfied.

Proof: Similarly as before, it suffices to prove the relation (78) for $m_n$ defined by (29). By (79),

\[
c_{\alpha_2}(m_n) = \frac{\alpha_2\, m_n^{b(\alpha_2)}}{\ln m_n}
\qquad\text{and}\qquad
c_{\alpha_1}(n) = \frac{\alpha_1\, n^{b(\alpha_1)}}{\ln n}
\]

so that (29) implies

\[
\frac{m_n^{1 - b(\alpha_2)} \ln m_n}{\alpha_2} = \frac{n^{1 - b(\alpha_1)} \ln n}{\alpha_1}\left(\gamma + o(1)\right)
\]

for the same $\gamma$ as in the previous proof. Since $1 - b(\alpha_2) < 1 - b(\alpha_1)$, this implies the asymptotic relation



n 



(80) 



ailnm„) 1_i,( ° 2 ' b (°2)-K°i) 

I 7} l-b(a 2 ) 

q;2 Inn 



Similarly as in the previous proof, we get from ( l30t 

c Q2 (mn) = m, 1+ , 
c Ql (ri) n 

^ 6(02) \ 

<y l-i>(a 2 ) \ 

v +°w ) 

b(a 2 )-H<xi) H»2) 
> n l-H«2) ( 7 l-i.(»2) + o(l)). 

Therefore the desired relation (78) holds. ■

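Lemma 20: Let the assumptions of Definition 3 hold and let for every $\alpha \in \{\alpha_1, \alpha_2\}$ the sequence $c_\alpha(n)$ generating the Bahadur function $g_\alpha$ satisfy the asymptotic condition

\[
c_\alpha(n) = \frac{\alpha\, k^{b(\alpha)}}{\ln k}
\tag{81}
\]

where $k = k_n \to \infty$ is the sequence considered above and $b(\alpha) : I \to\, ]0,\infty[$ is increasing on an interval $I$ covering $\{\alpha_1, \alpha_2\}$. Then (37) holds for $c_{\alpha_2/\alpha_1} = \infty$ and condition C1 is satisfied.

Proof: It suffices to apply Lemma 19 to the sequences

\[
c_{\alpha_1}(k) = \frac{\alpha_1\, k^{b(\alpha_1)}}{\ln k}
\qquad\text{and}\qquad
c_{\alpha_2}(m_k) = \frac{\alpha_2\, m_k^{b(\alpha_2)}}{\ln m_k}
\]

for $m_k$ defined by the condition

\[
\frac{m_k}{c_{\alpha_2}(m_k)} = \frac{g_{\alpha_1}(\Delta_{\alpha_1})}{g_{\alpha_2}(\Delta_{\alpha_2})} \cdot \frac{k}{c_{\alpha_1}(k)}\,\left(1 + o(1)\right) \quad \text{(cf. (29))}.
\tag{82}
\]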



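Example 21: Let the assumptions of Definition 5 hold for $\alpha_1 = 1$ and $\alpha_2 = \alpha > 1$, and let

\[
\cdots\; k^{b(\alpha)+1} \ln n \;\cdots
\qquad \text{for } b(\alpha) = \frac{\alpha - 1}{\alpha}.
\tag{83}
\]

By [5, Eq. 51, 76 and 79] and (83) the sequences

\[
c_1(n) = 1
\qquad\text{and}\qquad
c_\alpha(n) = \frac{\alpha\, k^{b(\alpha)}}{\ln k}
\tag{84}
\]

generate the Bahadur functions

\[
g_1(\Delta) = \Delta
\qquad\text{and}\qquad
g_\alpha(\Delta) = \left(\alpha(\alpha-1)\Delta\right)^{1/\alpha}, \qquad \Delta > 0.
\tag{85}
\]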

Here we cannot apply Lemma 18 since $c_1(n)$ is not a special case of $c_\alpha(n)$ for $\alpha = 1$. An alternative direct approach can be based on the observation that (29) cannot hold if $\liminf_n m_n < \infty$. In the opposite case $m_n \to \infty$ obviously implies

\[
c_{\alpha/1} \stackrel{\mathrm{def}}{=} \lim_{n \to \infty} \frac{c_\alpha(m_n)}{c_1(n)} = \infty
\]

so that C1 holds with $c_{\alpha_2/\alpha_1} = c_{\alpha/1} = \infty$. Hence Lemma 1 implies that the Bahadur efficiency $\mathrm{BE}\left(D_{1,n} \mid D_{\alpha,n}\right) = \infty$ obtained previously by Harremoes and Vajda [5, Eq. 81] coincides with the Bahadur efficiency of $D_{1,n}$ with respect to $D_{\alpha,n}$ in the present more precise sense of (35). Under a stronger condition on $k$ than (83), Harremoes and Vajda also established the strong consistency of the statistics $D_{1,n}$ and $D_{\alpha,n}$. One can verify that the functions in (85) are strongly Bahadur so that C2 holds as well. Hence, as argued by Lemma 3, we deal here with the Bahadur efficiency in the strong sense.

Example 22: Let the assumptions of Definition 5 hold for $\alpha_1 > 1$ and let the function $b(\alpha)$ be defined by (83) for all $\alpha > 1$. Harremoes and Vajda (2008) proved that if the sequence $k$ satisfies the condition (83) with $\alpha = \alpha_2$ then for all $\alpha \in \{\alpha_1, \alpha_2\}$ the function $g_\alpha(\Delta)$ given by the second formula in (85) is the Bahadur function of the statistic $D_{\alpha,n}$ generated by the sequence $c_\alpha(n)$ from the second formula in (84). Thus in this case the assumptions of Lemma 18 hold. From Lemmas 20 and 17 we conclude that the Bahadur efficiency

\[
\mathrm{BE}\left(D_{\alpha_1,n} \mid D_{\alpha_2,n}\right) = \infty \qquad \text{for all } 1 < \alpha_1 < \alpha_2 < \infty
\]

obtained in [5, Eq. 81] coincides with the Bahadur efficiency in the present precise sense. Similarly as in the previous example, we can arrive at the conclusion that this is the Bahadur efficiency in the strong sense.